What is claimed is:

1. A computer system comprising:
a processor; and 

a storage device coupled to the processor and having stored therein an instruction, 
which when executed by the processor, causes the processor to at least, 

access a packed data operand having at least two portions of data elements; 

select a set of data elements from a portion of the packed data operand, the portion 
including at least two data elements; and

copy each data element of the selected set of data elements to specified data fields 
located in the corresponding portion of a destination operand. 

2. The computer system of claim 1, wherein the packed data operand includes 
eight data elements and the processor selects a set of data elements from one of either the 
upper half or the lower half of the packed data operand. 

3. The computer system of claim 1, wherein the storage device further 
comprises a packing device for packing integer data into the data elements. 

4. The computer system of claim 1, wherein the data elements are 16-bit data 
elements and wherein the packed data and destination operands are each 128-bit operands.

5. The computer system of claim 1, wherein the packed data and destination
operands are the same operand. 
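
Illustrative sketch (not part of the claims): the select-and-copy behavior recited in claims 1 through 5 can be modeled in a few lines of Python. The function name `shuffle_portion`, the list-of-integers model of a 128-bit operand, and the choice to leave the other half of the destination untouched are all assumptions made for illustration; the claims do not specify them.

```python
def shuffle_portion(src, dst, portion, selectors):
    """Copy elements chosen from one half of src into the same half of dst.

    src, dst  : lists of eight 16-bit integers (a modeled 128-bit operand)
    portion   : 0 for the lower half (elements 0-3), 1 for the upper half (4-7)
    selectors : four indices in 0..3, one per destination field in that half
    """
    if len(src) != 8 or len(dst) != 8:
        raise ValueError("operands must hold eight 16-bit elements")
    if portion not in (0, 1) or len(selectors) != 4:
        raise ValueError("portion must be 0 or 1, with four selectors")
    if any(s not in range(4) for s in selectors):
        raise ValueError("each selector must be in 0..3")

    base = 4 * portion
    half = src[base:base + 4]      # snapshot, so src and dst may be the same list
    result = list(dst)             # assumption: the other half of dst is kept as-is
    for field, sel in enumerate(selectors):
        result[base + field] = half[sel] & 0xFFFF
    return result


src = [0x10, 0x11, 0x12, 0x13, 0x20, 0x21, 0x22, 0x23]
# Reverse the upper half, using the same operand as source and destination.
print([hex(x) for x in shuffle_portion(src, src, 1, [3, 2, 1, 0])])
```

Because the selected half is snapshotted before any field is written, the sketch also covers the case of claim 5, where source and destination are the same operand.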

6. A computer-implemented method comprising:

decoding a single instruction;

in response to decoding the single instruction, accessing a packed data operand 
including at least two portions of data elements; 

selecting a set of data elements from a portion of the packed data operand, the 
portion including at least two data elements; and

copying each data element of the selected set of data elements to specified data 
fields located in the corresponding portion of a destination operand. 

7. The method of claim 6, wherein the packed data operand includes eight data
elements and the set of data elements is selected from one of either the upper half or
the lower half of the packed data operand.

8. The method of claim 6, further comprising packing integer data into the data
elements.

9. The method of claim 6, wherein the data elements are 16-bit data elements
and wherein the packed data and destination operands are each 128-bit operands.

10. The method of claim 6, wherein the packed data and destination operands are
the same operand.

11. A computer-implemented method comprising:

accessing data representative of a first three-dimensional image;

altering the data using three-dimensional geometry to generate a second
three-dimensional image, the method of altering at least including,

accessing a packed data operand having at least two portions of data elements;

selecting a set of data elements from a portion of the packed data operand, the 
portion including at least two data elements; 

copying each data element of the selected set of data elements to specified data 
fields located in the corresponding portion of a destination operand; and 

displaying the second three-dimensional image. 

12. The method of claim 11, wherein the method of altering includes the 
performance of a three-dimensional transformation. 

13. The method of claim 11, wherein the packed data operand includes eight data
elements and the set of data elements is selected from one of either the upper half or
the lower half of the packed data operand.

14. The method of claim 11, wherein the method of altering includes packing 
integer data into the data elements. 

15. The method of claim 11, wherein the data elements are 16-bit data elements 
and wherein the packed data and destination operands are each 128-bit operands.

16. The method of claim 11, wherein the packed data and destination operands
are the same operand. 
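
Illustrative sketch (not part of the claims): claims 14 and 15 recite packing integer data into 16-bit data elements of a 128-bit operand. The Python below shows one way this could look, assuming that element i occupies bits 16i through 16i+15 and that signed values are stored in two's complement. The helper names (`pack16`, `unpack16`, `to_signed16`, `reorder_lower`) and the example of reordering one vertex's coordinates are hypothetical and chosen only to connect the packing step to the shuffle step.

```python
MASK16 = 0xFFFF


def pack16(values):
    """Pack eight integers into one 128-bit integer, element 0 in the low bits."""
    if len(values) != 8:
        raise ValueError("expected eight elements")
    operand = 0
    for i, v in enumerate(values):
        operand |= (v & MASK16) << (16 * i)   # two's complement truncation to 16 bits
    return operand


def unpack16(operand):
    """Split a 128-bit integer back into eight unsigned 16-bit elements."""
    return [(operand >> (16 * i)) & MASK16 for i in range(8)]


def to_signed16(x):
    """Reinterpret an unsigned 16-bit value as a signed one."""
    return x - 0x10000 if x & 0x8000 else x


def reorder_lower(operand, order):
    """Rearrange the four lower elements; the upper four are kept (assumption)."""
    elems = unpack16(operand)
    lower = elems[:4]
    return pack16([lower[i] for i in order] + elems[4:])


# Two hypothetical vertices stored as (x, y, z, w) in the lower and upper halves.
operand = pack16([3, -4, 5, 1, 7, 8, -9, 1])
swizzled = reorder_lower(operand, (1, 2, 0, 3))   # (x, y, z, w) -> (y, z, x, w)
print([to_signed16(e) for e in unpack16(swizzled)])
```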



17. A method for shuffling packed data elements comprising:

decoding a single instruction specifying a source operand, a destination operand, and 
a field of control bits; and

responsive to the single instruction and the field of control bits, generating a first
portion of the destination operand comprised of data elements from the same portion of the 
source operand. 

18. The method of claim 17, wherein the portion is one of either the upper half 
or the lower half of the source and destination operands. 

19. The method of claim 17, wherein the source operand and the destination 
operand are the same operand. 
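
Illustrative sketch (not part of the claims): claims 17 through 19 tie the generated portion to a field of control bits but do not recite how that field is laid out. The Python below assumes an 8-bit field holding four 2-bit selectors, with the lowest two bits controlling the first destination field. The function names `decode_control` and `generate_portion` are hypothetical.

```python
def decode_control(control):
    """Split an 8-bit control field into four 2-bit selectors (assumed layout)."""
    if not 0 <= control <= 0xFF:
        raise ValueError("control field must fit in eight bits")
    return [(control >> (2 * field)) & 0b11 for field in range(4)]


def generate_portion(src_portion, control):
    """Build one four-element destination portion from the same source portion."""
    if len(src_portion) != 4:
        raise ValueError("a portion holds four elements in this model")
    return [src_portion[sel] for sel in decode_control(control)]


upper_half = ["a", "b", "c", "d"]
print(generate_portion(upper_half, 0xE4))  # selectors 0,1,2,3: unchanged order
print(generate_portion(upper_half, 0x1B))  # selectors 3,2,1,0: reversed order
print(generate_portion(upper_half, 0x00))  # selectors 0,0,0,0: first element repeated
```

Under this layout every destination field draws only from the same portion of the source, which is the property the independent claim recites.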




20. A processor comprising:

a decoder to decode at least one single instruction specifying a source operand,
a destination operand, and a field of control bits; and

an execution unit, responsive to the field of control bits, to generate a first portion of 
the destination operand comprised of data elements from the same portion of the source 
operand. 

21. The processor of claim 20, wherein the portion is one of either the upper half 
or the lower half of the source and destination operands. 

22. The processor of claim 20, wherein the decoder is to decode a first 
instruction to shuffle 16-bit data elements from a first source operand of 128 bits to a first 
destination operand of 128 bits, a second instruction to shuffle data elements from the upper 
half of a second source operand to the upper half of a second destination operand, and a 
third instruction to shuffle data elements from the lower half of a third source operand to the 
lower half of a third destination operand. 
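
Illustrative sketch (not part of the claims): the upper-half and lower-half instructions of claim 22 can be modeled as two small functions over a list of eight 16-bit elements. The names `shuffle_low` and `shuffle_high`, the 2-bits-per-field control encoding, and the pass-through of the half that is not shuffled are assumptions; the pass-through choice mirrors how the SSE2 PSHUFLW and PSHUFHW instructions behave. The full-width first instruction is left out because the claims do not say how its control bits map onto eight elements.

```python
def _selectors(control):
    # Assumed encoding: bits 2k+1..2k pick the source element for field k.
    return [(control >> (2 * k)) & 0b11 for k in range(4)]


def shuffle_low(src, control):
    """Shuffle elements 0-3; elements 4-7 pass through unchanged (assumption)."""
    src = list(src)
    low, high = src[:4], src[4:]
    return [low[s] for s in _selectors(control)] + high


def shuffle_high(src, control):
    """Shuffle elements 4-7; elements 0-3 pass through unchanged (assumption)."""
    src = list(src)
    low, high = src[:4], src[4:]
    return low + [high[s] for s in _selectors(control)]


words = [0, 1, 2, 3, 4, 5, 6, 7]
print(shuffle_low(words, 0x1B))                       # lower half reversed
print(shuffle_high(words, 0x1B))                      # upper half reversed
print(shuffle_high(shuffle_low(words, 0x1B), 0x1B))   # both halves reversed
```

Chaining the two operations, as in the last line, rearranges both halves while never moving an element across the half boundary.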

23. A program loaded into a computer readable medium comprising:
a computer readable code to access a packed data operand having at least two 
portions of data elements; 

a computer readable code to select a portion of the packed data operand, the portion 
including at least two data elements; 

a computer readable code to select a set of data elements from the selected portion; and

a computer readable code to copy each data element of the selected set of data
elements to specified data fields located in the corresponding portion of a destination 
operand. 

24. The program of claim 23, wherein the source operand and the destination 
operand are the same operand. 

25. The program of claim 23, wherein m = 2 and the computer readable code to 
select a set of data elements selects one of either the upper half or the lower half of the 
packed data operand. 

26. A processor-implemented method comprising:

decoding a single instruction specifying a source operand of 128 bits, a destination 
operand of 128 bits, and a control word of eight bits; and

responsive to the single instruction and the control word, shuffling 16-bit data 
elements from the source operand to the destination operand. 

27. The method of claim 26, wherein the source operand and the destination 
operand are the same operand. 
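
Illustrative sketch (not part of the claims): an eight-bit control word is exactly large enough to hold four selectors that each choose one of four elements, since 4 fields x 2 bits = 8 bits. The Python below builds such a control word from a list of selectors. The function name `encode_control` and the layout (selector for field k in bits 2k+1..2k) are assumptions; claim 26 recites only the width of the control word.

```python
def encode_control(selectors):
    """Pack four selectors (each 0..3) into one 8-bit control word (assumed layout)."""
    if len(selectors) != 4:
        raise ValueError("expected four selectors")
    control = 0
    for field, sel in enumerate(selectors):
        if not 0 <= sel <= 3:
            raise ValueError("each selector must be in 0..3")
        control |= sel << (2 * field)
    return control


print(hex(encode_control([0, 1, 2, 3])))  # keep the original order
print(hex(encode_control([3, 2, 1, 0])))  # reverse the four elements
print(hex(encode_control([2, 2, 2, 2])))  # repeat element 2 in every field

# Four fields with four choices each give 4**4 arrangements, one per 8-bit value.
print(4 ** 4 == 2 ** 8)
```

Under this assumed layout no bit pattern is wasted: each of the 256 control words names a distinct arrangement of four elements drawn from a four-element group.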

28. A processor comprising:

a decoder to decode:

a first instruction specifying a first source operand of 128 bits, a first
destination operand of 128 bits, and a first control word of eight bits;

a second instruction specifying a second source operand of 128 bits, a second
destination operand of 128 bits, and a second control word of eight bits;

a third instruction specifying a third source operand of 128 bits, a third
destination operand of 128 bits, and a third control word of eight bits; and

an execution unit, responsive to the first instruction and the first control word, to
shuffle 16-bit data elements from the first source operand to the first destination operand;
responsive to the second instruction and the second control word, to shuffle data elements
from the upper half of the second source operand to the upper half of the second destination
operand; responsive to the third instruction and the third control word, to shuffle data
elements from the lower half of the third source operand to the lower half of the third
destination operand.

29. The processor of claim 28, wherein a source operand and a destination
operand are the same operand.

30. The processor of claim 28, wherein the processor is comprised of either or
both hardware and software components.